library(bayesrules)
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.5 ✓ purrr 0.3.4
## ✓ tibble 3.1.4 ✓ dplyr 1.0.7
## ✓ tidyr 1.1.3 ✓ stringr 1.4.0
## ✓ readr 2.0.1 ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
# E(π)=α/(α+β)≈0.40, so α/β≈2/3
# π is between 0.2 and 0.6
# Through trial and error with plot_beta() in the bayesrules package, I plot Beta models whose α and β satisfy this ratio (e.g., Beta(2,3), Beta(4,6), Beta(8,12)). Scaling the pair up further, I find that Beta(16,24) closely captures the typical outcomes and variability in the estimated chances.
plot_beta(16, 24)
Beta(16, 24)
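As a side check (my own addition, not part of the original solution), the effect of scaling α and β at a fixed ratio can be seen directly from the Beta standard deviation formula; `beta_sd()` below is a throwaway helper defined for illustration:

```r
# Helper (for illustration): standard deviation of a Beta(a, b) prior.
beta_sd <- function(a, b) sqrt(a * b / ((a + b)^2 * (a + b + 1)))

# Same ratio a/b = 2/3, increasing scale -> tighter prior:
beta_sd(2, 3)    # 0.2
beta_sd(8, 12)
beta_sd(16, 24)  # smallest of the three
```

Larger α and β at the same ratio keep the mean at 0.40 while shrinking the spread toward the stated (0.2, 0.6) range.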
# E(π)=α/(α+β)≈0.80, so α/β≈4
# Var(π)=αβ/((α+β)^2(α+β+1))=0.05, so β=11/25, α=44/25
plot_beta(44/25, 11/25)
Beta(44/25, 11/25)
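A quick arithmetic check of the algebra above (my own addition): plugging α = 44/25 and β = 11/25 back into the mean and variance formulas recovers the targets.

```r
# Verify that Beta(44/25, 11/25) hits E(pi) = 0.8 and Var(pi) = 0.05.
a <- 44/25
b <- 11/25
a / (a + b)                        # mean: 0.8
a * b / ((a + b)^2 * (a + b + 1))  # variance: 0.05
```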
# E(π)=α/(α+β)≈0.90, so α/β≈9
# π is between 0.85 and 1
# Through trial and error with plot_beta() in the bayesrules package, I plot Beta models whose α and β satisfy this ratio (e.g., Beta(9,1), Beta(27,3), Beta(45,5)). Scaling the pair up further, I find that Beta(720,80) closely captures the typical outcomes and variability in the estimated chances.
plot_beta(720,80)
Beta(720, 80)
# According to Sal, since she had no estimate based on the interviewer’s expression, it’s equally plausible for π to take on any value between 0 and 1. We can therefore model π by the standard Uniform model Unif(0,1), a special case of Beta(α,β) with α=β=1.
plot_beta(1,1)
Beta(1,1) or Unif(0,1)
# E(π)=α/(α+β)≈0.80, so α/β≈4
# π is between 0.70 and 0.90
# Through trial and error with plot_beta() in the bayesrules package, I plot Beta models whose α and β satisfy this ratio (e.g., Beta(4,1), Beta(8,2), Beta(12,3)). Scaling the pair up further, I find that Beta(180,45) closely captures the typical outcomes and variability in the estimated chances.
plot_beta(180,45)
Beta(180,45)
# E(π)=α/(α+β)≈0.90, so α/β≈9
# Var(π)=αβ/((α+β)^2(α+β+1))=0.08, so β=1/80, α=9/80
plot_beta(9/80,1/80)
Beta(9/80, 1/80)
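More generally, the mean-variance system can be solved in closed form rather than case by case. `beta_params()` is a helper of my own, not part of bayesrules; it rearranges E(π) = α/(α+β) and Var(π) = αβ/((α+β)²(α+β+1)):

```r
# Solve E(pi) = m and Var(pi) = v for (alpha, beta):
#   alpha = m * k, beta = (1 - m) * k, where k = m * (1 - m) / v - 1.
beta_params <- function(m, v) {
  k <- m * (1 - m) / v - 1
  c(alpha = m * k, beta = (1 - m) * k)
}
beta_params(0.9, 0.08)  # alpha = 9/80, beta = 1/80
beta_params(0.8, 0.05)  # alpha = 44/25, beta = 11/25 (the earlier case)
```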
# E(π)=α/(α+β)≈0.85, so α/β≈17/3
# π is between 0.75 and 0.95
# Through trial and error with plot_beta() in the bayesrules package, I plot Beta models whose α and β satisfy this ratio (e.g., Beta(17,3), Beta(34,6), Beta(51,9)). Scaling the pair up further, I find that Beta(204,36) closely captures the typical outcomes and variability in the estimated chances.
plot_beta(204,36)
Beta(204,36)
# E(π)=α/(α+β)≈0.3, so α/β≈3/7
# The variability in π is high for the Beta(α,β) model since Ben is pretty unsure about that guess, which makes α and β smaller.
# In a trial and error process, I use plot_beta() in the bayesrules package to plot Beta models with α and β pairs that meet this constraint (e.g., Beta(1.5,3.5), Beta(3,7), Beta(4.5,10.5)). Among these, I find that the Beta(4.5,10.5) closely captures the typical outcomes and variability in the estimated chances.
plot_beta(4.5,10.5)
Beta(4.5,10.5)
# Since it’s equally plausible for π to take on any value between 0 and 1, we can model π by the standard Uniform model Unif(0,1), a special case of Beta(α,β) with α=β=1. So the appropriate Beta prior model is Beta(1,1), i.e., Unif(0,1).
plot_beta(1,1)
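A minimal numerical check (my own addition) that the Beta(1,1) density coincides with the Unif(0,1) density:

```r
# dbeta(x, 1, 1) is constant at 1 on [0, 1], exactly like dunif(x):
x <- seq(0, 1, by = 0.1)
all.equal(dbeta(x, 1, 1), dunif(x))  # TRUE
```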
The mean of the Beta prior that I specified in part (a) is 0.5. I think this aligns with having no clue: when α and β are equal, the Beta pdf is symmetric around a common mean of 0.5. The mean of the Beta provides one measure of central tendency, i.e., what’s typical. When E(π)=0.5, the prior leans toward neither lower nor higher values of π, since 0.5 is the midpoint of 0 and 1. We know nothing beyond the fact that π falls between 0 and 1, and every value in that interval is equally plausible.
summarize_beta(1,1)
## mean mode var sd
## 1 0.5 NaN 0.08333333 0.2886751
The standard deviation of the Beta prior that I specified is 0.2886751.
summarize_beta(100,100)
## mean mode var sd
## 1 0.5 0.5 0.001243781 0.03526728
plot_beta(100,100)
Beta(100,100). Its standard deviation is 0.03526728.
summarize_beta(0.1,0.1)
## mean mode var sd
## 1 0.5 0 and 1 0.2083333 0.4564355
plot_beta(0.1,0.1)
Beta(0.1,0.1). Its standard deviation is 0.4564355.
(a): Beta(0.5,0.5) (b): Beta(2,2) (c): Beta(6,2) (d): Beta(1,1) (e): Beta(0.5,6) (f): Beta(6,6)
(a): Beta(1,0.3) (b): Beta(3,3) (c): Beta(4,2) (d): Beta(2,1) (e): Beta(5,6) (f): Beta(6,3)
knitr::include_graphics("image/beta_1.png")
As illustrated in the image (the means of π are marked by the solid blue vertical lines), α and β are equal in (a), (b), (d), and (f), so those Beta pdfs are all symmetric around a common mean of 0.5. Since the Beta pdf in (c) is left-skewed, its mode exceeds its mean, and both lie above 0.5. By contrast, since the Beta pdf in (e) is right-skewed, its mean exceeds its mode, and both lie below 0.5. Therefore, Beta(0.5,6) in (e) has the smallest mean and Beta(6,2) in (c) has the largest mean.
summarize_beta(0.5,0.5)
## mean mode var sd
## 1 0.5 0 and 1 0.125 0.3535534
summarize_beta(2,2)
## mean mode var sd
## 1 0.5 0.5 0.05 0.2236068
summarize_beta(6,2)
## mean mode var sd
## 1 0.75 0.8333333 0.02083333 0.1443376
summarize_beta(1,1)
## mean mode var sd
## 1 0.5 NaN 0.08333333 0.2886751
summarize_beta(0.5,6)
## mean mode var sd
## 1 0.07692308 0 0.009467456 0.09730085
summarize_beta(6,6)
## mean mode var sd
## 1 0.5 0.5 0.01923077 0.138675
The means of Beta(0.5,0.5), Beta(2,2), Beta(1,1), and Beta(6,6) are all 0.5; the mean of Beta(6,2) is 0.75; and the mean of Beta(0.5,6) is 0.07692308.
knitr::include_graphics("image/beta_2.png")
As illustrated in the image above (the modes of π are marked by the dashed blue vertical lines), the mode of the Beta in (a) is {0,1}, the mode in (b) is 0.5, the mode in (c) lies between 0.5 and 1, the mode in (d) is any value of π in [0,1], the mode in (e) is 0, and the mode in (f) is 0.5. Therefore, Beta(0.5,0.5) in (a) and Beta(0.5,6) in (e) attain the smallest mode (0), while Beta(0.5,0.5) in (a) and Beta(1,1) in (d) attain the largest mode (1).
summarize_beta(0.5,0.5)
## mean mode var sd
## 1 0.5 0 and 1 0.125 0.3535534
summarize_beta(2,2)
## mean mode var sd
## 1 0.5 0.5 0.05 0.2236068
summarize_beta(6,2)
## mean mode var sd
## 1 0.75 0.8333333 0.02083333 0.1443376
summarize_beta(1,1)
## mean mode var sd
## 1 0.5 NaN 0.08333333 0.2886751
summarize_beta(0.5,6)
## mean mode var sd
## 1 0.07692308 0 0.009467456 0.09730085
summarize_beta(6,6)
## mean mode var sd
## 1 0.5 0.5 0.01923077 0.138675
The mode of Beta(0.5,0.5) is {0,1}, the mode of Beta(2,2) is 0.5, the mode of Beta(6,2) is 0.8333333, the mode of Beta(1,1) is any value in [0,1], the mode of Beta(0.5,6) is 0, and the mode of Beta(6,6) is 0.5.
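When both α > 1 and β > 1, the modes reported above follow the closed form Mode(π) = (α−1)/(α+β−2); a quick arithmetic check (my own addition) for the unimodal interior cases:

```r
# Mode of Beta(a, b) is (a - 1) / (a + b - 2) when a, b > 1:
(6 - 1) / (6 + 2 - 2)  # Beta(6,2): 0.8333333
(6 - 1) / (6 + 6 - 2)  # Beta(6,6): 0.5
(2 - 1) / (2 + 2 - 2)  # Beta(2,2): 0.5
```

The formula does not apply to Beta(0.5,0.5), Beta(1,1), or Beta(0.5,6), whose modes sit at the boundary (or are undefined/non-unique), matching the NaN and "0 and 1" entries in the output.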
knitr::include_graphics("image/beta_3.png")
As indicated in the image above (the standard deviations of π are represented by the solid orange vertical lines), Beta(0.5,6) in (e) has the smallest standard deviation and Beta(0.5,0.5) in (a) has the largest standard deviation.
summarize_beta(0.5,0.5)
## mean mode var sd
## 1 0.5 0 and 1 0.125 0.3535534
summarize_beta(2,2)
## mean mode var sd
## 1 0.5 0.5 0.05 0.2236068
summarize_beta(6,2)
## mean mode var sd
## 1 0.75 0.8333333 0.02083333 0.1443376
summarize_beta(1,1)
## mean mode var sd
## 1 0.5 NaN 0.08333333 0.2886751
summarize_beta(0.5,6)
## mean mode var sd
## 1 0.07692308 0 0.009467456 0.09730085
summarize_beta(6,6)
## mean mode var sd
## 1 0.5 0.5 0.01923077 0.138675
The standard deviation of Beta(0.5,0.5) is 0.3535534, the standard deviation of Beta(2,2) is 0.2236068, the standard deviation of Beta(6,2) is 0.1443376, the standard deviation of Beta(1,1) is 0.2886751, the standard deviation of Beta(0.5,6) is 0.09730085, and the standard deviation of Beta(6,6) is 0.138675.
plot_beta(0.5,0.5)
plot_beta(2,2)
plot_beta(6,2)
plot_beta(1,1)
plot_beta(0.5,6)
plot_beta(6,6)
summarize_beta(0.5,0.5)
## mean mode var sd
## 1 0.5 0 and 1 0.125 0.3535534
summarize_beta(2,2)
## mean mode var sd
## 1 0.5 0.5 0.05 0.2236068
summarize_beta(6,2)
## mean mode var sd
## 1 0.75 0.8333333 0.02083333 0.1443376
summarize_beta(1,1)
## mean mode var sd
## 1 0.5 NaN 0.08333333 0.2886751
summarize_beta(0.5,6)
## mean mode var sd
## 1 0.07692308 0 0.009467456 0.09730085
summarize_beta(6,6)
## mean mode var sd
## 1 0.5 0.5 0.01923077 0.138675
summarize_beta(8,2)
## mean mode var sd
## 1 0.8 0.875 0.01454545 0.1206045
summarize_beta(1,20)
## mean mode var sd
## 1 0.04761905 0 0.002061431 0.04540298
plot_beta(8,2)
plot_beta(1,20)
# The first salesperson who works in North Dakota
summarize_beta_binomial(8,2, y = 12, n = 50)
## model alpha beta mean mode var sd
## 1 prior 8 2 0.8000000 0.8750000 0.014545455 0.12060454
## 2 posterior 20 40 0.3333333 0.3275862 0.003642987 0.06035716
# The second salesperson who works in Louisiana
summarize_beta_binomial(1,20, y = 12, n = 50)
## model alpha beta mean mode var sd
## 1 prior 1 20 0.04761905 0.000000 0.002061431 0.04540298
## 2 posterior 13 58 0.18309859 0.173913 0.002077410 0.04557861
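The posterior parameters printed above follow directly from the Beta-Binomial conjugacy rule, Beta(α + y, β + n − y); `update_beta()` below is a throwaway helper of my own for checking the arithmetic:

```r
# Conjugate update: prior Beta(a, b) plus Binomial data
# (y successes in n trials) gives posterior Beta(a + y, b + n - y).
update_beta <- function(a, b, y, n) c(alpha = a + y, beta = b + n - y)
update_beta(8, 2, 12, 50)   # North Dakota: Beta(20, 40)
update_beta(1, 20, 12, 50)  # Louisiana:    Beta(13, 58)
```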
# The first salesperson who works in North Dakota
plot_beta(8,2)
plot_binomial_likelihood(12,50)
plot_beta_binomial(8,2,12,50)
# The second salesperson who works in Louisiana
plot_beta(1,20)
plot_binomial_likelihood(12,50)
plot_beta_binomial(1,20,12,50)
c) After observing in the poll that 12 of 50 U.S. residents prefer the term “pop”, the first salesperson (North Dakota) became less confident than before that a high proportion of U.S. residents call a sweet carbonated drink “pop”, whereas the second salesperson (Louisiana) became more confident than before.
Overall, the first salesperson is still more confident than the second that a high proportion of U.S. residents call a sweet carbonated drink “pop”: the first salesperson’s posterior mean for π, the proportion of U.S. residents who use “pop”, is about 0.33, while the second salesperson’s posterior mean is about 0.18.
# Since E(π)=α/(α+β)=1/4 and Mode(π)=(α-1)/(α+β-2)=5/22, solving the two equations gives α=6, β=18
plot_beta(6, 18)
Thus the prior Beta model is Beta(6,18).
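A quick check (my own addition) that α = 6 and β = 18 satisfy both stated constraints:

```r
# Verify the mean and mode of Beta(6, 18):
a <- 6
b <- 18
a / (a + b)            # mean: 0.25
(a - 1) / (a + b - 2)  # mode: 5/22 = 0.2272727...
```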
plot_beta_binomial(alpha = 6, beta = 18, y = 15, n = 50)
summarize_beta_binomial(alpha = 6, beta = 18, y = 15, n = 50)
## model alpha beta mean mode var sd
## 1 prior 6 18 0.2500000 0.2272727 0.007500000 0.08660254
## 2 posterior 21 53 0.2837838 0.2777778 0.002710007 0.05205773
The posterior model for π is Beta(21,53).
The mean is 0.2837838, the mode is 0.2777778, and the standard deviation is 0.05205773.
The posterior model more closely reflects the data than the prior information: as illustrated in part (b), the posterior pdf overlaps more with the (scaled) Binomial likelihood function than with the prior Beta pdf.
# E(π)=α/(α+β)≈0.15, so α/β≈3/17
# π is between 0.10 and 0.25
# Through trial and error with plot_beta() in the bayesrules package, I plot Beta models whose α and β satisfy this ratio (e.g., Beta(3,17), Beta(6,34), Beta(9,51)). Scaling the pair up further, I find that Beta(39,221) closely captures the typical outcomes and variability in the estimated chances.
plot_beta(39,221)
Beta(39,221)
summarize_beta_binomial(39,221,30,90)
## model alpha beta mean mode var sd
## 1 prior 39 221 0.1500000 0.1472868 0.0004885057 0.02210217
## 2 posterior 69 281 0.1971429 0.1954023 0.0004509332 0.02123519
The posterior model for π is Beta(69,281).
As shown in the part (b), the posterior mean of π is 0.1971429, the posterior mode of π is 0.1954023, and the posterior standard deviation of π is 0.02123519.
plot_beta_binomial(39,221,30,90)
The posterior model reflects the prior information more closely than the data: the overlap between the Binomial likelihood function and the posterior pdf is smaller than that between the prior Beta pdf and the posterior pdf, as illustrated in the plot above.
# According to the description, it is likely that in 2020s, E(π)=α/(α+β)>0.3, so α/β>3/7
# π is between 0.35 and 0.60
# In a trial and error process, I use plot_beta() in the bayesrules package to plot Beta models with α and β pairs that meet this constraint (e.g., Beta(60,63), Beta(63,70), Beta(65,77)). Among these, I find that the Beta(65,77) closely captures the typical outcomes and variability in the estimated chances.
plot_beta(65,77)
b)
summarize_beta_binomial(65,77,80,200)
## model alpha beta mean mode var sd
## 1 prior 65 77 0.4577465 0.4571429 0.001735767 0.04166253
## 2 posterior 145 197 0.4239766 0.4235294 0.000712013 0.02668357
The posterior model for π is Beta(145,197).
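The posterior mean can also be seen as a weighted average of the prior mean and the sample proportion, a standard Beta-Binomial identity; a quick check of the arithmetic for this case (my own addition):

```r
# E(pi | y) = w * E(pi) + (1 - w) * y/n, with weight w = (a + b)/(a + b + n).
a <- 65; b <- 77; y <- 80; n <- 200
w <- (a + b) / (a + b + n)
w * a / (a + b) + (1 - w) * y / n  # matches (a + y)/(a + b + n) = 0.4239766
```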
plot_beta_binomial(65,77,80,200)
c) As shown in the part (b), the posterior mean of π is 0.4239766, the posterior mode of π is 0.4235294, and the posterior standard deviation of π is 0.02668357.
# In the prior model of π, α=2, β=3
# Since the posterior model of π given an observed Y=y successes in n trials is Beta(α+y,β+n−y), 2+y=11, y=9, 3+n-9=24, n=30.
summarize_beta_binomial(2,3,9,30)
## model alpha beta mean mode var sd
## 1 prior 2 3 0.4000000 0.3333333 0.040000000 0.20000000
## 2 posterior 11 24 0.3142857 0.3030303 0.005986395 0.07737179
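The data can be read off by inverting the updating rule Beta(α + y, β + n − y); a one-line check of the arithmetic above (my own addition):

```r
# Invert the conjugate update: prior Beta(2, 3), posterior Beta(11, 24).
y <- 11 - 2        # successes: alpha_post - alpha_prior = 9
n <- y + (24 - 3)  # trials: successes + failures = 9 + 21 = 30
c(y = y, n = n)
```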
# In the prior model of π, α=1, β=2
# Since the posterior model of π given an observed Y=y successes in n trials is Beta(α+y,β+n−y), 1+y=100, y=99, 2+n-99=3, n=100.
summarize_beta_binomial(1,2,99,100)
## model alpha beta mean mode var sd
## 1 prior 1 2 0.3333333 0.000000 0.0555555556 0.23570226
## 2 posterior 100 3 0.9708738 0.980198 0.0002719027 0.01648947
The prior model compares the plausibility of different values of π: the greater f(π), the more plausible the corresponding value of π. It indicates that π is most likely around 0.98 but could reasonably range from 0.875 to 1. The likelihood function provides insight into the relative compatibility of the observed data with different π∈[0,1]. The fact that L(π|y) is maximized at π=0.21 suggests that observing y successes in n trials is most likely when π=0.21; the further a hypothetical π value lies from 0.21, the less compatible it is with the observed data. L(π|y) effectively drops to 0 for π values below 0.03 or above 0.5, so it’s extremely unlikely that we would’ve observed this y/n rate if, in fact, π were as low as 3% or as high as 50%. Overall, the prior model places higher plausibility on large values of π than the likelihood function does.
The posterior model indicates that π is most likely around 0.75 but could reasonably range from 0.625 to 0.875. It more closely agrees with the prior than with the data.
# For the prior model, it is likely that E(π)=α/(α+β)≈0.98, so α/β≈49.
plot_beta(98,2)
# For the likelihood function, it is most likely that y/n≈0.21, with 0.03 < y/n < 0.5.
plot_binomial_likelihood(21,100)
# For the posterior model, π is between 0.625 and 0.875.
plot_beta_binomial(98,2,21,100)
The prior model compares the plausibility of different values of π: the greater f(π), the more plausible the corresponding value of π. It indicates that π is most likely around 0.5 but could reasonably range from 0 to 1. The likelihood function provides insight into the relative compatibility of the observed data with different π∈[0,1]. The fact that L(π|y) is maximized at π=0.1 suggests that observing y successes in n trials is most likely when π=0.1; the further a hypothetical π value lies from 0.1, the less compatible it is with the observed data. L(π|y) effectively drops to 0 for π values above 0.375, so it’s extremely unlikely that we would’ve observed this y/n rate if, in fact, π were as high as 37.5%. Overall, the prior model places higher plausibility on large values of π than the likelihood function does.
The posterior model indicates that π is most likely around 0.14 but could reasonably range from 0 to 0.4375. It more closely agrees with the data than with the prior.
# For the prior model, it is likely that E(π)=α/(α+β)≈0.5, so α/β≈1.
plot_beta(3,3)
# For the likelihood function, it is most likely that y/n≈0.10, with 0 < y/n < 0.375.
plot_binomial_likelihood(5,45)
# For the posterior model, π is between 0 and 0.4375.
plot_beta_binomial(3,3,5,45)
summarize_beta_binomial(3,3,30,40)
## model alpha beta mean mode var sd
## 1 prior 3 3 0.5000000 0.5000000 0.035714286 0.1889822
## 2 posterior 33 13 0.7173913 0.7272727 0.004313639 0.0656783
plot_beta_binomial(3,3,30,40)
b)
summarize_beta_binomial(3,3,15,20)
## model alpha beta mean mode var sd
## 1 prior 3 3 0.5000000 0.5000000 0.035714286 0.18898224
## 2 posterior 18 8 0.6923077 0.7083333 0.007889546 0.08882312
plot_beta_binomial(3,3,15,20)